Smart Paper Notes

Accelerated and Quantitative 3D Semisolid MT/CEST Imaging using a Generative Adversarial Network (GAN-CEST)

Jonah Weigand-Whittier , Maria Sedykh , Kai Herz , Jaume Coll-Font , Anna N. Foster , Elizabeth R. Gerstner , Christopher Nguyen , Moritz Zaiss , Christian T. Farrar , Or Perlman

Category: Machine Learning

2022-07-22

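Purpose: To substantially shorten the acquisition time required for quantitative 3D chemical exchange saturation transfer (CEST) and semisolid magnetization transfer (MT) imaging, and to allow rapid reconstruction of chemical exchange parameter maps. Methods: Three-dimensional CEST and MT magnetic resonance fingerprinting (MRF) datasets of L-arginine phantoms, whole brains, and calf muscles from healthy volunteers, cancer patients, and cardiac patients were acquired using 3T clinical scanners at 3 different sites, with 3 different scanner models and coils. A generative adversarial network supervised framework (GAN-CEST) was then designed and trained to learn the mapping from a reduced input data space to the quantitative exchange parameter space, while preserving perceptual and quantitative content. Results: The GAN-CEST 3D acquisition time was 42-52 seconds, 70% shorter than CEST-MRF. The quantitative reconstruction of the entire brain took 0.8 seconds. Excellent agreement was observed between the ground truth and the GAN-based L-arginine concentration and pH values (Pearson's r > 0.97, NRMSE < 1.5%). GAN-CEST images from a brain tumor subject yielded a semisolid volume fraction and exchange rate NRMSE of 3.8 $\pm$ 1.3% and 4.6 $\pm$ 1.3%, respectively, and SSIM of 96.3 $\pm$ 1.6% and 95.0 $\pm$ 2.4%, respectively. The mapping of the calf muscle exchange parameters yielded NRMSE < 7% and SSIM > 94% for the semisolid exchange parameters. In regions with large susceptibility artifacts, GAN-CEST demonstrated improved performance and reduced noise compared to MRF. Conclusion: GAN-CEST can substantially reduce the acquisition time for quantitative semisolid MT/CEST mapping, while retaining performance even when facing pathologies and scanner models that were not available during training.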
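The agreement metrics quoted in the abstract above (Pearson's r and NRMSE) can be illustrated with a minimal sketch. The abstract does not state how NRMSE is normalized; dividing the RMSE by the range of the reference values is an assumption made here, and the numbers are made up for illustration rather than taken from the paper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def nrmse(reference, estimate):
    """RMSE divided by the range of the reference (assumed normalization)."""
    n = len(reference)
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n
    return math.sqrt(mse) / (max(reference) - min(reference))

# Made-up values for illustration only, not data from the paper.
reference = [10.0, 20.0, 30.0, 40.0]
estimate = [10.5, 19.5, 30.5, 39.5]

r = pearson_r(reference, estimate)
err = nrmse(reference, estimate)
print(f"Pearson r = {r:.4f}, NRMSE = {100 * err:.2f}%")
```

Other normalizations (for example, dividing by the mean of the reference) are also common, so reported NRMSE values are only comparable when the same convention is used.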


Connected Reconfiguration of Polyominoes Amid Obstacles using RRT*

Javier Garcia , Michael Yannuzzi , Peter Kramer , Christian Rieck , Aaron T. Becker

Category: Robotics

2022-07-04

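This paper investigates using a sampling-based approach, RRT*, to reconfigure a set of connected tiles in complex environments where multiple obstacles may be present. Since the target application is the automated building of discrete, cellular structures using mobile robots, there are constraints that determine which tiles can be picked up and where they can be dropped off during reconfiguration. We compare our approach to two algorithms used as global and local planners, and show that we are able to find more efficient build sequences using a reasonable number of samples, in environments with varying degrees of obstacle space.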
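One constraint implied by the abstract above is that the tile structure stays connected while tiles are moved. The following is a minimal sketch of that single check, assuming tiles are unit cells on an integer grid and that connectivity means sharing an edge; the function names are hypothetical, and the paper's actual pickup and drop-off rules (including obstacles and robot reachability) are not modeled here.

```python
from collections import deque

def is_connected(tiles):
    """Return True if the set of (x, y) tiles is edge-connected."""
    tiles = set(tiles)
    if not tiles:
        return True
    start = next(iter(tiles))
    seen = {start}
    queue = deque([start])
    # Breadth-first search over edge-adjacent neighbours.
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in tiles and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(tiles)

def removable_tiles(tiles):
    """Tiles whose removal leaves the remaining structure connected."""
    tiles = set(tiles)
    return {t for t in tiles if is_connected(tiles - {t})}

# An L-shaped tetromino: only the two end tiles can be picked up
# without splitting the structure.
shape = {(0, 0), (0, 1), (0, 2), (1, 0)}
print(sorted(removable_tiles(shape)))  # [(0, 2), (1, 0)]
```

A sampling-based planner could use a check like this to reject candidate moves that would disconnect the structure before adding them to its search tree.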


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Category: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

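Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.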


NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Kaustubh D. Dhole , Varun Gangal , Sebastian Gehrmann , Aadesh Gupta , Zhenhao Li , Saad Mahamood , Abinaya Mahendiran , Simon Mille , Ashish Srivastava , Samson Tan

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2021-12-06

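Data augmentation is an important component in the robustness evaluation of natural language processing (NLP) models, as well as in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework that supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards, and robustness analysis results are publicly available in the NL-Augmenter repository (https://github.com/gem-benchmark/nl-augmenter).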
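The distinction the abstract draws between transformations (which modify data) and filters (which split data by a feature) can be sketched in a few lines of plain Python. This is only an illustration of the two concepts: the function names and the example corpus are hypothetical, and none of this is the NL-Augmenter API.

```python
def swap_case_transformation(sentence):
    """Hypothetical transformation: return a modified copy of the input."""
    return sentence.swapcase()

def length_filter(sentence, max_words=5):
    """Hypothetical filter: True if the sentence has at most max_words words."""
    return len(sentence.split()) <= max_words

# Made-up corpus for illustration.
corpus = [
    "Data augmentation helps.",
    "Robustness evaluation of natural language models matters a lot.",
]

# A transformation produces new, perturbed examples from existing ones.
augmented = [swap_case_transformation(s) for s in corpus]

# A filter partitions the data into subsets that share a feature.
short = [s for s in corpus if length_filter(s)]
long_ = [s for s in corpus if not length_filter(s)]

print(augmented[0])
print(len(short), len(long_))
```

Evaluating a model separately on the transformed examples and on each filtered subset is one way to probe where its robustness breaks down.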


Advances in Neural Rendering

Ayush Tewari , Justus Thies , Ben Mildenhall , Pratul Srinivasan , Edgar Tretschk , Yifan Wang , Christoph Lassner , Vincent Sitzmann , Ricardo Martin-Brualla , Stephen Lombardi

Category: Computer Vision

2021-11-10

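Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanying textures (e.g., created by an artist), point clouds (e.g., from a depth sensor), volumetric grids (e.g., from a CT scan), or implicit surface functions (e.g., truncated signed distance fields). The reconstruction of such a scene representation from observations using differentiable rendering losses is known as inverse graphics or inverse rendering. Neural rendering is closely related, and combines ideas from classical computer graphics and machine learning to create algorithms for synthesizing images from real-world observations. Neural rendering is a leap forward towards the goal of synthesizing photo-realistic image and video content. In recent years, we have seen immense progress in this field through hundreds of publications that show different ways to inject learnable components into the rendering pipeline. This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, now often referred to as neural scene representations. A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel viewpoint synthesis of a captured scene. In addition to methods that handle static scenes, we cover neural scene representations for modeling non-rigidly deforming objects...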


Flexible Supervised Autonomy for Exploration in Subterranean Environments

Harel Biggie , Eugene R. Rush , Danny G. Riley , Shakeeb Ahmad , Michael T. Ohradzansky , Kyle Harlow , Michael J. Miles , Daniel Torres , Steve McGuire , Eric W. Frew

Category: Robotics

2023-01-02

While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.


Bimanual Telemanipulation with Force and Haptic Feedback through an Anthropomorphic Avatar System

Christian Lenz , Sven Behnke

Category: Robotics

2023-01-02

Robotic teleoperation is a key technology for a wide variety of applications. It allows sending robots instead of humans to remote, possibly dangerous locations while still using the human brain with its enormous knowledge and creativity, especially for solving unexpected problems. A main challenge in teleoperation consists of providing enough feedback to the human operator for situation awareness, thus creating full immersion, as well as offering the operator suitable control interfaces to achieve efficient and robust task fulfillment. We present a bimanual telemanipulation system consisting of an anthropomorphic avatar robot and an operator station providing force and haptic feedback to the human operator. The avatar arms are controlled in Cartesian space with a direct mapping of the operator movements. The measured forces and torques on the avatar side are haptically displayed to the operator. We developed a predictive avatar model for limit avoidance which runs on the operator side, ensuring low latency. The system was successfully evaluated during the ANA Avatar XPRIZE competition semifinals. In addition, we performed in-lab experiments and carried out a small user study with mostly untrained operators.


Muse: Text-To-Image Generation via Masked Generative Transformers

Huiwen Chang , Han Zhang , Jarred Barber , AJ Maschinot , Jose Lezama , Lu Jiang , Ming-Hsuan Yang , Kevin Murphy , William T. Freeman , Michael Rubinstein

Category: Computer Vision | Artificial Intelligence | Machine Learning

2023-01-02

We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io
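The masked modeling task described in the abstract above (predicting randomly masked image tokens in a discrete token space) can be sketched as a data-preparation step. This is a minimal illustration under stated assumptions: the mask id, the masking ratio, the token values, and the function name are all hypothetical, and the text conditioning and the model itself are omitted.

```python
import random

MASK_ID = -1  # hypothetical id reserved for the mask token

def mask_tokens(tokens, mask_ratio, rng):
    """Replace a random subset of tokens with MASK_ID.

    Returns the masked sequence and the sorted positions that were masked;
    a model would be trained to predict the original tokens at those
    positions from the unmasked ones.
    """
    n_mask = max(1, round(mask_ratio * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for p in positions:
        masked[p] = MASK_ID
    return masked, positions

rng = random.Random(0)
image_tokens = [17, 4, 250, 93, 8, 121, 66, 30]  # made-up codebook indices
masked, positions = mask_tokens(image_tokens, mask_ratio=0.5, rng=rng)
targets = [image_tokens[p] for p in positions]

print(masked)
print(positions, targets)
```

Because every masked position is predicted from the same unmasked context, many tokens can be filled in during a single forward pass, which is the intuition behind the parallel decoding that the abstract contrasts with token-by-token autoregressive generation.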


Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

Timothy T. Yu , Da Ma , Jayden Cole , Myeong Jin Ju , Mirza F. Beg , Marinko V. Sarunic

Category: Artificial Intelligence | Computer Vision

2023-01-02

Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subsampled OCT data and, more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods to better aid clinicians in their decision-making to improve patient outcomes, by reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.


Design, Modeling, and Evaluation of Separable Tendon-Driven Robotic Manipulator with Long, Passive, Flexible Proximal Section

Christian DeBuys , Florin C. Ghesu , Jagadeesan Jayender , Reza Langari , Young-Ho Kim

Category: Robotics

2023-01-01

The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with increasing proximal section angle for all testing conditions, with an average error reduction of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
